Data Science Applied to Ag - Final Project - Design

Author

Ali Pirhadi Tavandashti & Swabir Alhassan Musah

1 Instructions

This file contains both the instructions for the mid-term project and placeholders for your code. You are required to use this file to produce code, output, and answers to the questions below.

Besides simply creating output, make sure to interpret the output. You will need to create tables and/or plots to arrive at the answers, and then comment on what you found below it.

To get you setup, you will need to:

  • Student #1: create a repository on your GitHub account. You can call this repository “2025dsa_finalproject_groupX”, where X is the number of your group. Make it public, add a README, add a .gitignore for R.
  • Student #1: follow the steps we did in class to start a new RStudio project with version control.
  • Student #1: in your computer, create the sub-folders code, data, output, and move your data set into the data folder. Also, student 1 moves this current script into the code folder. Do a git stage, commit, push.
  • Student #1: on GitHub, go the repository settings and invite your partner to be a collaborator in the repository. That will give them push permission.
  • Now, both students should clone this repository on their computers like we did in class. Make sure this step works well and that you can pull and push from GitHub.
  • Student 2, after cloning, does a git pull to get all these updates on their computer.
  • Student 1 and 2 work together to update the README file. README files should explain what the repository is about, the goals of that project, who is working in it, and any other important details you may find.

2 Introduction

Describe here the introduction of your problem. Consider this as a shortened version of your paper, where you will briefly discuss in 3-4 paragraphs what is the issue/gap in literature, and how the data you collected will help answer this gap.

Aflatoxin contamination in stored peanuts poses a serious threat to both food safety and economic sustainability in the agricultural sector. These toxic metabolites, primarily produced by Aspergillus flavus, thrive under specific environmental conditions—namely high humidity and temperature. Traditional storage systems often fail to maintain these parameters within safe limits, especially in bulk conditions, resulting in product degradation and health hazards. While previous studies have explored general storage recommendations and basic environmental controls, there remains a significant gap in understanding the dynamic interaction of airflow, relative humidity, and temperature in large-scale peanut storage units such as metal columns or breathable tote bags. More importantly, there is limited application of high-resolution Computational Fluid Dynamics (CFD) modeling validated by real-world experimental data in this context. This restricts our ability to predict and manage localized microclimates that may contribute to aflatoxin proliferation. To address this gap, we developed a 3D CFD model and collected experimental data on pressure, airflow velocity, temperature, and humidity at multiple points inside a peanut-filled column. The dataset includes peanuts and varying environmental conditions, allowing for statistical and physical analysis of these interacting parameters. By applying ANOVA and other statistical techniques, we aim to identify the most influential factors on airflow resistance and environmental control. This research bridges the experimental and simulation domains, contributing validated data that can inform future designs of storage units and environmental controls. Ultimately, the findings will support better post-harvest management practices to reduce aflatoxin risk and maintain peanut quality during storage and transport.

Test Chamber

3 Hypothesis and objectives

Describe here your hypothesis, followed by your objectives. Make sure your hypothesis are testable and bold, and objectives are clear.

Hypothesis: The interaction between temperature and relative humidity will reduce the internal microclimate of peanut storage units which is determined by the pressure distribution in the storage units, contributing to conditions that may favor aflatoxin contamination.

Objectives: To assess the individual and interactive effects of temperature and relative humidity on the internal climate of peanut storage units using experimental and statistical approaches (e.g., ANOVA).

4 Material and Methods

Describe here your overall material and methods as it pertains to the analysis you will conduct, including study description, site/setup description, what equipment was used, etc. just like you would in a paper. Make sure to clearly explain what was measured and how.

Study Overview: This study was designed to investigate the influence of temperature and relative humidity on airflow resistance and pressure distribution in a peanut storage system. A combination of experimental measurements and statistical analysis was employed to quantify how environmental variables affect the internal conditions of shelled peanut storage, particularly focusing on airflow dynamics and pressure variations across vertical levels within the column.

Experimental Site and Setup: The experimental setup consisted of a vertically oriented metal storage column with a total height of 54 inches (1.3716 meters), of which 48 inches (1.2192 meters) was filled with shelled peanuts. The column had a diameter of 14 inches (0.3556 meters) and included a perforated sheet at the base to facilitate uniform airflow distribution. Ambient air was introduced from the bottom of the column using a controlled airflow system at a fixed velocity of 0.1 m/s. The airflow was allowed to pass vertically through the peanut mass, simulating real storage ventilation conditions. Peanuts were conditioned to different environmental conditions to represent various storage scenarios. Test Chamber

Measurement and Instrumentation: Pressure was measured at five vertical levels (Taps 1 through 5), with Tap 1 located just above the perforated sheet (at the base), and Tap 5 located near the top of the peanut mass. Static pressure sensors were used to collect time-series data under each moisture condition. Air temperature and relative humidity were also recorded at each tap using calibrated digital sensors. The physical properties of the peanuts—such as bulk density (704.86 kg/m³), true density (997.36 kg/m³), and porosity (36.91%) were determined experimentally prior to the airflow tests to ensure accurate interpretation of airflow resistance.

4.1 Study design

Clearly describe your study design here, including treatment design (which factors and levels, the hierarchy among them, etc.), and your experimental design (number of reps/blocks, how was randomization performed, etc.), as we talked about in class.

This study was conducted to evaluate the effects of relative humidity and temperature on pressure distribution within a peanut-filled storage column. A factorial treatment design was employed, consisting of two fixed factors: . Relative Humidity (RH): 7 levels (50%, 55%, 60%, 65%, 70%, 75%, 80%) . Temperature (T): 3 levels (5°C, 10°C, 15°C) This results in a total of 21 unique treatment combinations (7 humidity × 3 temperature). Each treatment was replicated 5 times, resulting in 105 total observations.

Experimental Design and Randomization: A completely randomized design (CRD) was used for this experiment. Each of the 21 treatment combinations was randomly assigned across five replicates to ensure that any variation due to environmental or experimental conditions was minimized and that the effects measured could be attributed confidently to the treatment variables. Randomization was performed using R software to assign treatment combinations to replicate runs. This random allocation helps to prevent bias and maintains the independence of observations across replicates.

Response Variable: The primary response variable measured was static pressure (Pa), recorded at multiple points in the storage column using calibrated pressure sensors. The pressure values across all treatments ranged from –29.81 Pa to 102.76 Pa, capturing both pressure drop and buildup within the system under different temperature and humidity conditions.

4.2 Statistical analysis

Describe here your statistical analysis, including what type of ANOVA model you ran (based on the design above), what was your response variable, what were your explanatory variables and how were the explanatory variables treated (random or fixed). Provide your alpha level. Explain which function from which package you used to analyze this data. Explain how you checked linear model assumptions and whether or not they were met. Overall, make sure you explain in sufficient detail that, if given your data, a knowledgeable person would be able to reproduce your analysis exactly.

5 Results

Here is where the coding is going to happen, and it will be completely up to you. Include under this section as many sub-sections (using ##) and as many chunks needed to create the analytical workflow for your analysis, starting at loading packages and data, wrangling, EDA, modeling, assumptions checking, ANOVA table, means, pairwise comparisons, and final publication-quality plot.

Make sure to run a model that reflects your study design. Even if your study design does not include one of the designs covered in class, you are still expected to run the most appropriate model. If you need help for references, let me know.

Before each chunk, describe the steps being performed in that chunk. For example, “Here I will load the data”.

If a chunk produces output, like printing a data frame, statistical summary, a plot, ANOVA table, etc., make sure to write text interpreting what you see and how you can/will use that information to move forward to the next steps in the workflow.

5.1 a) Setup

Here is where we load the packages we will use.

#install.packages("broom")

# Loading packages
library(tidyverse) # for data wrangling and plotting
library(janitor) # to clean data
library(car) # for Anova function
library(broom) # for model residuals extraction
library(emmeans) # for model mean extraction
library(multcomp) # for pairwise comparison letter display

Here is were we will load the data

pre_df <- read.csv("../Data/Peanuts_dataset.csv")

pre_df
    trt rep Humidity... Temperature.C. Pressure.Pa.
1    19   3          80              5       -29.81
2    20   1          80             10       -26.65
3    16   4          75              5       -15.01
4    10   1          65              5       -12.27
5    19   1          80              5       -11.77
6    16   2          75              5       -10.78
7    19   2          80              5       -10.21
8    20   4          80             10        -6.17
9    17   1          75             10        -5.01
10   20   3          80             10        -3.91
11   20   5          80             10        -0.57
12   13   4          70              5         0.46
13   16   1          75              5         3.22
14   19   5          80              5         3.59
15   16   3          75              5         4.45
16   17   2          75             10         5.82
17   18   1          75             15         7.17
18   21   3          80             15         7.27
19   18   2          75             15         8.42
20   21   4          80             15         9.06
21   18   4          75             15         9.61
22   11   1          65             10        11.74
23   15   1          70             15        14.61
24   14   4          70             10        14.69
25   14   1          70             10        14.98
26   20   2          80             10        15.75
27   21   1          80             15        16.95
28   21   5          80             15        18.96
29   15   3          70             15        21.46
30   17   3          75             10        22.92
31   17   4          75             10        23.37
32   13   3          70              5        25.44
33   10   3          65              5        25.96
34   15   5          70             15        26.01
35    4   4          55              5        27.62
36   18   3          75             15        28.22
37    7   1          60              5        29.11
38   15   4          70             15        29.17
39   11   2          65             10        29.30
40   12   2          65             15        29.46
41   13   2          70              5        30.73
42   21   2          80             15        32.62
43   19   4          80              5        34.50
44   16   5          75              5        36.74
45    7   2          60              5        37.48
46   10   2          65              5        37.48
47    8   2          60             10        40.29
48   18   5          75             15        41.26
49   13   1          70              5        42.81
50   12   5          65             15        43.71
51    1   2          50              5        44.08
52    7   3          60              5        44.11
53   12   1          65             15        44.28
54    9   3          60             15        44.33
55   14   2          70             10        45.12
56   17   5          75             10        46.27
57    4   1          55              5        46.32
58    2   2          50             10        47.94
59   14   3          70             10        48.87
60   14   5          70             10        48.98
61   15   2          70             15        49.13
62   10   5          65              5        50.65
63   12   3          65             15        54.58
64    9   1          60             15        55.34
65    8   1          60             10        55.35
66    8   4          60             10        55.65
67    9   5          60             15        56.90
68   11   5          65             10        57.86
69   11   3          65             10        60.31
70    8   3          60             10        61.64
71   11   4          65             10        62.04
72    3   5          50             15        63.35
73   10   4          65              5        63.86
74    2   5          50             10        64.88
75    5   2          55             10        65.43
76    6   2          55             15        65.69
77    2   3          50             10        66.60
78    2   1          50             10        66.73
79    6   3          55             15        67.02
80    5   3          55             10        68.74
81    7   5          60              5        68.94
82    8   5          60             10        70.44
83    4   2          55              5        70.77
84    3   1          50             15        70.82
85   13   5          70              5        70.82
86    5   4          55             10        71.47
87    9   4          60             15        71.53
88    1   1          50              5        73.43
89    4   5          55              5        74.69
90    9   2          60             15        76.64
91    6   1          55             15        78.35
92    6   4          55             15        79.20
93    1   3          50              5        80.73
94   12   4          65             15        81.06
95    5   5          55             10        82.20
96    3   2          50             15        84.44
97    3   3          50             15        84.78
98    1   4          50              5        86.58
99    5   1          55             10        88.58
100   3   4          50             15        89.01
101   2   4          50             10        89.65
102   4   3          55              5        92.29
103   6   5          55             15        98.64
104   1   5          50              5        99.77
105   7   4          60              5       102.76

5.2 b) EDA tables

summary(pre_df)
      trt          rep     Humidity... Temperature.C.  Pressure.Pa.   
 Min.   : 1   Min.   :1   Min.   :50   Min.   : 5     Min.   :-29.81  
 1st Qu.: 6   1st Qu.:2   1st Qu.:55   1st Qu.: 5     1st Qu.: 16.95  
 Median :11   Median :3   Median :65   Median :10     Median : 44.28  
 Mean   :11   Mean   :3   Mean   :65   Mean   :10     Mean   : 42.28  
 3rd Qu.:16   3rd Qu.:4   3rd Qu.:75   3rd Qu.:15     3rd Qu.: 67.02  
 Max.   :21   Max.   :5   Max.   :80   Max.   :15     Max.   :102.76  
glimpse(pre_df)
Rows: 105
Columns: 5
$ trt            <int> 19, 20, 16, 10, 19, 16, 19, 20, 17, 20, 20, 13, 16, 19,…
$ rep            <int> 3, 1, 4, 1, 1, 2, 2, 4, 1, 3, 5, 4, 1, 5, 3, 2, 1, 3, 2…
$ Humidity...    <int> 80, 80, 75, 65, 80, 75, 80, 80, 75, 80, 80, 70, 75, 80,…
$ Temperature.C. <int> 5, 10, 5, 5, 5, 5, 5, 10, 10, 10, 10, 5, 5, 5, 5, 10, 1…
$ Pressure.Pa.   <dbl> -29.81, -26.65, -15.01, -12.27, -11.77, -10.78, -10.21,…

6 c) Wrangling

pre_dfc <-  pre_df %>% clean_names()

pre_dfc
    trt rep humidity temperature_c pressure_pa
1    19   3       80             5      -29.81
2    20   1       80            10      -26.65
3    16   4       75             5      -15.01
4    10   1       65             5      -12.27
5    19   1       80             5      -11.77
6    16   2       75             5      -10.78
7    19   2       80             5      -10.21
8    20   4       80            10       -6.17
9    17   1       75            10       -5.01
10   20   3       80            10       -3.91
11   20   5       80            10       -0.57
12   13   4       70             5        0.46
13   16   1       75             5        3.22
14   19   5       80             5        3.59
15   16   3       75             5        4.45
16   17   2       75            10        5.82
17   18   1       75            15        7.17
18   21   3       80            15        7.27
19   18   2       75            15        8.42
20   21   4       80            15        9.06
21   18   4       75            15        9.61
22   11   1       65            10       11.74
23   15   1       70            15       14.61
24   14   4       70            10       14.69
25   14   1       70            10       14.98
26   20   2       80            10       15.75
27   21   1       80            15       16.95
28   21   5       80            15       18.96
29   15   3       70            15       21.46
30   17   3       75            10       22.92
31   17   4       75            10       23.37
32   13   3       70             5       25.44
33   10   3       65             5       25.96
34   15   5       70            15       26.01
35    4   4       55             5       27.62
36   18   3       75            15       28.22
37    7   1       60             5       29.11
38   15   4       70            15       29.17
39   11   2       65            10       29.30
40   12   2       65            15       29.46
41   13   2       70             5       30.73
42   21   2       80            15       32.62
43   19   4       80             5       34.50
44   16   5       75             5       36.74
45    7   2       60             5       37.48
46   10   2       65             5       37.48
47    8   2       60            10       40.29
48   18   5       75            15       41.26
49   13   1       70             5       42.81
50   12   5       65            15       43.71
51    1   2       50             5       44.08
52    7   3       60             5       44.11
53   12   1       65            15       44.28
54    9   3       60            15       44.33
55   14   2       70            10       45.12
56   17   5       75            10       46.27
57    4   1       55             5       46.32
58    2   2       50            10       47.94
59   14   3       70            10       48.87
60   14   5       70            10       48.98
61   15   2       70            15       49.13
62   10   5       65             5       50.65
63   12   3       65            15       54.58
64    9   1       60            15       55.34
65    8   1       60            10       55.35
66    8   4       60            10       55.65
67    9   5       60            15       56.90
68   11   5       65            10       57.86
69   11   3       65            10       60.31
70    8   3       60            10       61.64
71   11   4       65            10       62.04
72    3   5       50            15       63.35
73   10   4       65             5       63.86
74    2   5       50            10       64.88
75    5   2       55            10       65.43
76    6   2       55            15       65.69
77    2   3       50            10       66.60
78    2   1       50            10       66.73
79    6   3       55            15       67.02
80    5   3       55            10       68.74
81    7   5       60             5       68.94
82    8   5       60            10       70.44
83    4   2       55             5       70.77
84    3   1       50            15       70.82
85   13   5       70             5       70.82
86    5   4       55            10       71.47
87    9   4       60            15       71.53
88    1   1       50             5       73.43
89    4   5       55             5       74.69
90    9   2       60            15       76.64
91    6   1       55            15       78.35
92    6   4       55            15       79.20
93    1   3       50             5       80.73
94   12   4       65            15       81.06
95    5   5       55            10       82.20
96    3   2       50            15       84.44
97    3   3       50            15       84.78
98    1   4       50             5       86.58
99    5   1       55            10       88.58
100   3   4       50            15       89.01
101   2   4       50            10       89.65
102   4   3       55             5       92.29
103   6   5       55            15       98.64
104   1   5       50             5       99.77
105   7   4       60             5      102.76

We converted our treatments (humidity and temperature) to factors

pre_dfw <- pre_dfc %>%
  mutate(rep = factor(rep),
         temperature_c = factor(temperature_c),
         humidity = factor(humidity)
         ) %>% 
  mutate(trtname = paste0(temperature_c, "+" , humidity))

pre_dfw
    trt rep humidity temperature_c pressure_pa trtname
1    19   3       80             5      -29.81    5+80
2    20   1       80            10      -26.65   10+80
3    16   4       75             5      -15.01    5+75
4    10   1       65             5      -12.27    5+65
5    19   1       80             5      -11.77    5+80
6    16   2       75             5      -10.78    5+75
7    19   2       80             5      -10.21    5+80
8    20   4       80            10       -6.17   10+80
9    17   1       75            10       -5.01   10+75
10   20   3       80            10       -3.91   10+80
11   20   5       80            10       -0.57   10+80
12   13   4       70             5        0.46    5+70
13   16   1       75             5        3.22    5+75
14   19   5       80             5        3.59    5+80
15   16   3       75             5        4.45    5+75
16   17   2       75            10        5.82   10+75
17   18   1       75            15        7.17   15+75
18   21   3       80            15        7.27   15+80
19   18   2       75            15        8.42   15+75
20   21   4       80            15        9.06   15+80
21   18   4       75            15        9.61   15+75
22   11   1       65            10       11.74   10+65
23   15   1       70            15       14.61   15+70
24   14   4       70            10       14.69   10+70
25   14   1       70            10       14.98   10+70
26   20   2       80            10       15.75   10+80
27   21   1       80            15       16.95   15+80
28   21   5       80            15       18.96   15+80
29   15   3       70            15       21.46   15+70
30   17   3       75            10       22.92   10+75
31   17   4       75            10       23.37   10+75
32   13   3       70             5       25.44    5+70
33   10   3       65             5       25.96    5+65
34   15   5       70            15       26.01   15+70
35    4   4       55             5       27.62    5+55
36   18   3       75            15       28.22   15+75
37    7   1       60             5       29.11    5+60
38   15   4       70            15       29.17   15+70
39   11   2       65            10       29.30   10+65
40   12   2       65            15       29.46   15+65
41   13   2       70             5       30.73    5+70
42   21   2       80            15       32.62   15+80
43   19   4       80             5       34.50    5+80
44   16   5       75             5       36.74    5+75
45    7   2       60             5       37.48    5+60
46   10   2       65             5       37.48    5+65
47    8   2       60            10       40.29   10+60
48   18   5       75            15       41.26   15+75
49   13   1       70             5       42.81    5+70
50   12   5       65            15       43.71   15+65
51    1   2       50             5       44.08    5+50
52    7   3       60             5       44.11    5+60
53   12   1       65            15       44.28   15+65
54    9   3       60            15       44.33   15+60
55   14   2       70            10       45.12   10+70
56   17   5       75            10       46.27   10+75
57    4   1       55             5       46.32    5+55
58    2   2       50            10       47.94   10+50
59   14   3       70            10       48.87   10+70
60   14   5       70            10       48.98   10+70
61   15   2       70            15       49.13   15+70
62   10   5       65             5       50.65    5+65
63   12   3       65            15       54.58   15+65
64    9   1       60            15       55.34   15+60
65    8   1       60            10       55.35   10+60
66    8   4       60            10       55.65   10+60
67    9   5       60            15       56.90   15+60
68   11   5       65            10       57.86   10+65
69   11   3       65            10       60.31   10+65
70    8   3       60            10       61.64   10+60
71   11   4       65            10       62.04   10+65
72    3   5       50            15       63.35   15+50
73   10   4       65             5       63.86    5+65
74    2   5       50            10       64.88   10+50
75    5   2       55            10       65.43   10+55
76    6   2       55            15       65.69   15+55
77    2   3       50            10       66.60   10+50
78    2   1       50            10       66.73   10+50
79    6   3       55            15       67.02   15+55
80    5   3       55            10       68.74   10+55
81    7   5       60             5       68.94    5+60
82    8   5       60            10       70.44   10+60
83    4   2       55             5       70.77    5+55
84    3   1       50            15       70.82   15+50
85   13   5       70             5       70.82    5+70
86    5   4       55            10       71.47   10+55
87    9   4       60            15       71.53   15+60
88    1   1       50             5       73.43    5+50
89    4   5       55             5       74.69    5+55
90    9   2       60            15       76.64   15+60
91    6   1       55            15       78.35   15+55
92    6   4       55            15       79.20   15+55
93    1   3       50             5       80.73    5+50
94   12   4       65            15       81.06   15+65
95    5   5       55            10       82.20   10+55
96    3   2       50            15       84.44   15+50
97    3   3       50            15       84.78   15+50
98    1   4       50             5       86.58    5+50
99    5   1       55            10       88.58   10+55
100   3   4       50            15       89.01   15+50
101   2   4       50            10       89.65   10+50
102   4   3       55             5       92.29    5+55
103   6   5       55            15       98.64   15+55
104   1   5       50             5       99.77    5+50
105   7   4       60             5      102.76    5+60
summary(pre_dfw)
      trt     rep    humidity temperature_c  pressure_pa       trtname         
 Min.   : 1   1:21   50:15    5 :35         Min.   :-29.81   Length:105        
 1st Qu.: 6   2:21   55:15    10:35         1st Qu.: 16.95   Class :character  
 Median :11   3:21   60:15    15:35         Median : 44.28   Mode  :character  
 Mean   :11   4:21   65:15                  Mean   : 42.28                     
 3rd Qu.:16   5:21   70:15                  3rd Qu.: 67.02                     
 Max.   :21          75:15                  Max.   :102.76                     
                     80:15                                                     

6.1 d) EDA plots

Here we visualize the effect of temperature on pressure

ggplot(pre_dfw, aes(x = temperature_c, 
                    y = pressure_pa,
                    color = temperature_c)) +
  geom_boxplot() +
  geom_jitter() +
  theme(legend.position = "none")

6.1.1 Interpretation

From the boxplot, the pressure distribution in the peanut storage tote seem to be higher at a temperature of 5°C, which decreased at the 10°C and increased again at 15°C. However, we can not clearly state the significant effect of temperature on the pressure distribution in the stored peanut without mean separations.

Here we visualize the effect of humidity on pressure

ggplot(pre_dfw, aes(x = humidity, 
                    y = pressure_pa,
                    color = humidity)) +
  geom_boxplot() +
  geom_jitter() +
  theme(legend.position = "none")

6.1.2 Interpretation

For the effect of relative humidity on pressure distribution in stored peanuts, we can see a trend of decreasing pressure with increasing relative humidity. That is, as the the relative humidity in stored peanut increases, the pressure distribution in the stored peanut decreases. The significance of the humidity on pressure in stored peanut will be discernible after Anova.

Here we visualize the effect of the interaction between temperature and humidity on pressure

ggplot(pre_dfw, aes(x = temperature_c,
                   y = pressure_pa,
                   color = temperature_c
                   )) + 
  geom_boxplot() +
  geom_jitter() +
  facet_grid(.~humidity)#+

  theme(legend.position = "none")
List of 1
 $ legend.position: chr "none"
 - attr(*, "class")= chr [1:2] "theme" "gg"
 - attr(*, "complete")= logi FALSE
 - attr(*, "validate")= logi TRUE

6.1.3 Interpretation

There seem to be no interaction effect of temperature and relative humidity on pressure distribution in stored peanut. There is a general trend of decreasing pressure with increasing relative humidity irrespective of the temperature. However, there is no particular trend in pressure with regards to temperature under different relative humidity treatments.

7 e) Statistical model

7.1 Set-to-zero vs. sum-to-zero

Below we change the default to sum-to-zero (“contr.sum”) before fitting the model.

# Changing to sum-to-zero contrast
options(contrasts = c("contr.sum", "contr.poly"))

# Model fitting
crd_mod <- lm(pressure_pa ~ temperature_c + humidity +
                temperature_c: humidity,
              data = pre_dfw)

# Summary
summary(crd_mod)

Call:
lm(formula = pressure_pa ~ temperature_c + humidity + temperature_c:humidity, 
    data = pre_dfw)

Residuals:
    Min      1Q  Median      3Q     Max 
-45.406 -10.760  -0.504  10.592  46.280 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)               42.2847     1.8658  22.663  < 2e-16 ***
temperature_c1            -4.5835     2.6387  -1.737 0.086046 .  
temperature_c2            -0.5332     2.6387  -0.202 0.840340    
humidity1                 31.9013     4.5704   6.980 6.37e-10 ***
humidity2                 29.5160     4.5704   6.458 6.53e-09 ***
humidity3                 15.7493     4.5704   3.446 0.000891 ***
humidity4                  0.3833     4.5704   0.084 0.933356    
humidity5                -10.0660     4.5704  -2.202 0.030373 *  
humidity6                -28.5067     4.5704  -6.237 1.72e-08 ***
temperature_c1:humidity1   7.3155     6.4635   1.132 0.260928    
temperature_c2:humidity1  -6.4928     6.4635  -1.005 0.318007    
temperature_c1:humidity2  -4.8791     6.4635  -0.755 0.452432    
temperature_c2:humidity2   4.0166     6.4635   0.621 0.536002    
temperature_c1:humidity3   3.0295     6.4635   0.469 0.640487    
temperature_c2:humidity3  -0.8268     6.4635  -0.128 0.898523    
temperature_c1:humidity4  -4.9485     6.4635  -0.766 0.446056    
temperature_c2:humidity4   2.1152     6.4635   0.327 0.744285    
temperature_c1:humidity5   6.4169     6.4635   0.993 0.323664    
temperature_c2:humidity5   2.8426     6.4635   0.440 0.661218    
temperature_c1:humidity6  -5.4705     6.4635  -0.846 0.399752    
temperature_c2:humidity6   5.4292     6.4635   0.840 0.403299    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 19.12 on 84 degrees of freedom
Multiple R-squared:  0.7032,    Adjusted R-squared:  0.6326 
F-statistic: 9.952 on 20 and 84 DF,  p-value: 9.744e-15

8 f) ANOVA table

The Anova() function allows us to use type 3 sum of squares.

Anova(crd_mod, type =3)
Anova Table (Type III tests)

Response: pressure_pa
                       Sum Sq Df  F value Pr(>F)    
(Intercept)            187739  1 513.5897 <2e-16 ***
temperature_c            1662  2   2.2728 0.1093    
humidity                68554  6  31.2567 <2e-16 ***
temperature_c:humidity   2543 12   0.5796 0.8528    
Residuals               30706 84                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

9 Interpretation

The main of effect temperature as well as the interaction between temperature and humidity did not significantly affect pressure in stored peanut. However, In our case, we have a significant effect of relative humidity of pressure, and the interaction between temperature and humidity are not significant. We will only focus on relative humidity since it is the only treatment that is significant.

10 g) Linear model assumptions

10.1 Extracting residuals

First, let’s extract our model residuals, and also create studentized residuals.

crd_resid <- augment(crd_mod) %>% 
  mutate(.studresid = rstudent(crd_mod))

crd_resid
# A tibble: 105 × 10
   pressure_pa temperature_c humidity .fitted  .resid  .hat .sigma    .cooksd
         <dbl> <fct>         <fct>      <dbl>   <dbl> <dbl>  <dbl>      <dbl>
 1      -29.8  5             80         -2.74 -27.1   0.200   18.9 0.0298    
 2      -26.6  10            80         -4.31 -22.3   0.2     19.0 0.0203    
 3      -15.0  5             75          3.72 -18.7   0.200   19.1 0.0143    
 4      -12.3  5             65         33.1  -45.4   0.200   18.4 0.0839    
 5      -11.8  5             80         -2.74  -9.03  0.200   19.2 0.00332   
 6      -10.8  5             75          3.72 -14.5   0.200   19.2 0.00856   
 7      -10.2  5             80         -2.74  -7.47  0.2     19.2 0.00227   
 8       -6.17 10            80         -4.31  -1.86  0.200   19.2 0.000141  
 9       -5.01 10            75         18.7  -23.7   0.200   19.0 0.0228    
10       -3.91 10            80         -4.31   0.400 0.200   19.2 0.00000651
# ℹ 95 more rows
# ℹ 2 more variables: .std.resid <dbl>, .studresid <dbl>

10.2 Residual independence

Here we are checking for the residual independence

ggplot(crd_resid, aes(x = .fitted, y = .studresid)) + 
  
  geom_point(shape = 21,
             fill = "purple",
             size = 3,
             alpha = 0.7) +
  geom_hline(yintercept = c(-3,0,3), color = "red") +
  geom_smooth()
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'

11 Interpretation

The residual looks normally distributed as there is no clear pattern in the plot.

11.1 Residual homoscedasticity

Here we are checking for the residual homoscedasticity

ggplot(crd_resid, aes(x=.fitted, y=.studresid))+
  geom_hline(yintercept = 0, color="red")+
  geom_point(shape = 21,
             fill = "purple", 
             size = 3,
             alpha = .7)+
  geom_smooth()+
  geom_hline(yintercept = c(-3,3), color = "red")+
  theme_bw()
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'

11.2 Interpretation

The residual looks normally distributed as there is homogeneity of variance in the plot.

11.3 Residual normality

Here we are checking for the residual homoscedasticity

ggplot(crd_resid, aes(sample = .studresid)) +
  stat_qq() + 
  stat_qq_line()

11.4 Interpretation

The residual looks normally distributed.

ggplot(crd_resid, aes(x = .studresid)) +
  geom_density() +
  scale_x_continuous(breaks = c(-3,0,3), limits = c(-3,3))

11.5 Residual outliers

  • For this, we use the fitted vs. residual plot.
ggplot(crd_resid, aes(x=.fitted, y=.studresid))+
  geom_hline(yintercept = 0, color="red")+
  geom_point(shape = 21,
             fill = "purple", 
             size = 3,
             alpha = .7)+
  geom_smooth()+
  geom_hline(yintercept = c(-3,3), color = "red")+
  theme_bw()
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'

All residuals are within the [-3, 3] interval, so nothing to worry here.
Now that model assumptions have been checked and met, we can proceed to using the model for inference.

12 h) Model means

The next step in the workflow is extracting the model means.

crd_means_humid <- emmeans(crd_mod, ~humidity)
NOTE: Results may be misleading due to involvement in interactions
crd_means_humid
 humidity emmean   SE df lower.CL upper.CL
 50        74.19 4.94 84    64.37     84.0
 55        71.80 4.94 84    61.98     81.6
 60        58.03 4.94 84    48.22     67.9
 65        42.67 4.94 84    32.85     52.5
 70        32.22 4.94 84    22.40     42.0
 75        13.78 4.94 84     3.96     23.6
 80         3.31 4.94 84    -6.51     13.1

Results are averaged over the levels of: temperature_c 
Confidence level used: 0.95 

13 i) Pairwise comparisons

Now that we extracted means, we then perform pairwise comparisons among them.

crd_cld_humid <- cld(crd_means_humid,
                   reversed = T,  
                   Letters = letters,  
                   adjust = "none" 
                   )


crd_cld_humid
 humidity emmean   SE df lower.CL upper.CL .group
 50        74.19 4.94 84    64.37     84.0  a    
 55        71.80 4.94 84    61.98     81.6  ab   
 60        58.03 4.94 84    48.22     67.9   b   
 65        42.67 4.94 84    32.85     52.5    c  
 70        32.22 4.94 84    22.40     42.0    c  
 75        13.78 4.94 84     3.96     23.6     d 
 80         3.31 4.94 84    -6.51     13.1     d 

Results are averaged over the levels of: temperature_c 
Confidence level used: 0.95 
significance level used: alpha = 0.05 
NOTE: If two or more means share the same grouping symbol,
      then we cannot show them to be different.
      But we also did not show them to be the same. 

13.1 Interpretation

Since only humidity was significant, pairwise comparison and data interpretation was performed on only the main effect of humidity of pressure. From the pairwise comparison, we can see that there was no significant difference between humidity at 50 and 55 and between 55 and 60. Furthermore, no significant difference was observed between relative humidity at 65 and 70, and between 75 and 80. In summary, the pressure in the stored peanut decreased with increasing relative humidity.

Here we tested the hypothesis of everything compared to everything else, which corresponds to our first method using :.

crd_cld_selected <- crd_cld_humid %>%
  as.data.frame() %>%
  mutate(letter = trimws(.group)) %>%
  mutate(trtname = paste0(humidity))

crd_cld_selected
  humidity    emmean       SE df  lower.CL upper.CL .group letter trtname
7       50 74.186000 4.936553 84 64.369124 84.00288   a         a      50
6       55 71.800667 4.936553 84 61.983790 81.61754   ab       ab      55
5       60 58.034000 4.936553 84 48.217124 67.85088    b        b      60
4       65 42.668000 4.936553 84 32.851124 52.48488     c       c      65
3       70 32.218667 4.936553 84 22.401790 42.03554     c       c      70
2       75 13.778000 4.936553 84  3.961124 23.59488      d      d      75
1       80  3.307333 4.936553 84 -6.509543 13.12421      d      d      80

13.2 g) Final plot

Let’s plot our results, including both raw data (for allowing our audience to inspect data distribution) and statistical model summary (i.e., letter separation) for inference purposes.

final_plot <- ggplot() +
  geom_boxplot(data = pre_dfw, aes(x = humidity, y = pressure_pa, color = humidity)) +
  geom_label(data = crd_cld_selected, aes(x = humidity, y = emmean, label = letter)) +
  labs(title = "Pressure across Humidity Levels",
       x = "Humidity",
       y = "Pressure (Pa)") +
  theme_minimal()

final_plot

14 Conclution

In conclusion, the interaction between temperature and relative humidity did not significantly affect the internal microclimate of peanut storage units (pressure distribution) which is in contrast with our alternative hypothesis, hence we fail to reject our null hypothesis.

14.1 k) # Save the plot

ggsave("../output/interactionWithComparasion_pressure.png", plot = final_plot, width = 8, height = 5)

15 Team work in GitHub

Whether you are working with your future-self or as duos, make sure to stage, commit, and push after finishing each of the sub-sections above. When committing, write commit messages that are short and descriptive (e.g., finished wrangling).

If you are working in duos, make sure to split the workload. I will check your collaborative work through the commit history, and if one student has contributed significantly more than the other, than that will impact grades.

Tip 1: to avoid merge conflicts, make sure to pull first, and then start working locally. That will ensure that any changes made by your partner will be “downloaded” before you make changes to the files locally.

Tip 2: make use of the Issues on this repository to set up to-do lists and assign tasks to different people. You can also use each issue/task to discuss how things should be run and get to an agreement.

16 Submitting your work

Once you have developed all the code and answers, make sure to Render this quarto file.

Notes on rendering:

  • Make sure to render your work and inspect how the final html look like.
  • If it does not look professional for whatever reason, then fix the issue, re-render it, recheck.
  • Only send me your work once your html file looks professional.
  • Some potential issues you may encounter and how to fix them:
    • Some times your code may be creating a table output that is waaay to long and cumbersome to scroll through when rendered. If this is the case, make it more professional looking by using the head() function to only print the first handful of rows (instead of thousands of rows).

    • DO NOT delete the file’s heading levels (# and ##). They set up the proper heading 1 and 2 levels, and I use them to guide my grading.

    • If a given chunk is also outputting warnings or messages, inhibit this behavior by changing the chunk options message and warning to FALSE.

    • If, after rendered, 2 lines of text are connected and you wish to “break line” between them, add 2 extra spaces after the first one.

After rendering, an .html file will be created on your code folder.

Rename this file to LASTNAME1-LASTNAME2_finalproject.html.
For ex., Bastos-Mendes_finalproject.html.

Submit the html file on eLC by April 30th 11:59 pm.